We construct a Banach manifold of states, which are Gibbs states



1 Introduction 



The ultimate goal of the present project is a quantum version of the theory
of information (or statistical) manifolds; in classical probability this circle of
ideas has been rather successful in many fields, from estimation theory to
dissipative dynamics in neural networks. We were inspired by the nice work
of Pistone and Sempi, who put the classical theory on a firm mathematical
foundation in the infinite-dimensional case. While our ambitions are
the same as [53], our results are not as complete, and the technical problems,
arising from the non-commutative nature of the potentials, are quite
different. The problem first arose in the quantum theory of many-body systems
at 'finite' temperature, in the works of Matsubara, Mori and Kubo
[26]. There, we find the correlation functions for observables written
as imaginary-time-ordered products of operators. The mathematical theory
was advanced by Bogoliubov, who showed that the two-point correlation
function was real and positive-definite. In the geometrical context, this is
thus a Riemannian metric on the vector space of Hermitian operators; it is
often known as the Bogoliubov-Kubo-Mori, or BKM, metric.



Parametric estimation theory starts with a family of probability distributions,
and also the data, which give us a distribution in the form of a
histogram. We seek the best representation of the data; this is done by finding
the member of the family that minimises the 'distance' to the distribution of
the data; here, which concept of distance is to be used is one of the problems.
Gauss used the Euclidean distance, giving his famous least-squares fit to
the data. This distance, however, depends on the coordinates used. Fisher
[11] introduced the information matrix, which is a tensor under change of
coordinates, and this was developed into a Riemannian metric tensor on the
manifold of parametrised distributions by Rao [34]. Dawid realised that
the theory also needed an affine connection, and that this did not have to be
that of Levi-Civita. Ideas from information theory were then incorporated,
and the 'distance' to be minimised turned out to be the Kullback-Leibler
relative entropy. The poetic geometry involving the dual affine structures was
then put together by Amari, in a notable book. The manifold of states
is determined by a chosen subspace $\mathcal{X}$, spanned by (linearly independent)
random variables $\{X_1, \dots, X_n\}$. It can be parametrised by the canonical
coordinates $\xi^j$, $\xi \in V$, where $V$ is a convex cone in $\mathbf{R}^n$. Let $X = \sum_j \xi^j X_j$;
the exponential family determined by $\mathcal{X}$ is the set of states of the form

$$p_\xi = Z^{-1}\exp\Big\{-\sum_j \xi^j X_j\Big\} = \exp\{-(X + \Psi(\xi))\}. \qquad (1)$$

The canonical coordinates are (inverses of) generalised temperatures,
and the Massieu function $\Psi = \log Z$ is the thermodynamic potential. The
(+1)-affine structure comes from forming convex mixtures of the $\xi^j$; that is,
the mixtures of $p_{\xi_1}$ and $p_{\xi_2}$ are states of the form $p_{\lambda\xi_1+\lambda'\xi_2}$. In this paper,
$\lambda'$ will denote $1 - \lambda > 0$. Since the $\xi^j$ are global affine coordinates, this
affine structure is flat and torsion-free. The Legendre dual to $\Psi$ is the
entropy, written as a function of the probability, and it also plays the role
of a thermodynamic potential. The dual coordinates,

$$\eta_i = -\frac{\partial \Psi}{\partial \xi^i}, \qquad (2)$$

are the expectation values of the random variables $X_i$ in the state $p_\xi$.
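The statement that the dual coordinates are expectation values can be checked numerically on a finite sample space. The sketch below is illustrative only: the sample space, the random variables `X`, the point `xi`, and the function names are hypothetical choices, and the sign convention (the expectation of $X_i$ equals minus the derivative of $\Psi = \log Z$ with respect to $\xi^i$) is the one implied by the form of eq. (1).

```python
import math

def massieu(xi, X):
    """Psi(xi) = log Z for weights exp(-sum_j xi[j] * X[j][w]) on a finite sample space."""
    n_points = len(X[0])
    weights = [math.exp(-sum(c * Xj[w] for c, Xj in zip(xi, X))) for w in range(n_points)]
    return math.log(sum(weights))

def state(xi, X):
    """Probability vector p_xi = exp(-(X + Psi)) of the exponential family."""
    psi = massieu(xi, X)
    n_points = len(X[0])
    return [math.exp(-sum(c * Xj[w] for c, Xj in zip(xi, X)) - psi) for w in range(n_points)]

def expectations(xi, X):
    """Dual coordinates: expectation of each X_i in the state p_xi."""
    p = state(xi, X)
    return [sum(pw * xw for pw, xw in zip(p, Xi)) for Xi in X]

def minus_grad_massieu(xi, X, step=1e-6):
    """Central finite-difference approximation of -dPsi/dxi^i."""
    grads = []
    for i in range(len(xi)):
        up = list(xi)
        down = list(xi)
        up[i] += step
        down[i] -= step
        grads.append(-(massieu(up, X) - massieu(down, X)) / (2 * step))
    return grads

# Hypothetical example: four sample points, two random variables.
X = [[0.0, 1.0, 2.0, 3.0], [1.0, 0.0, 1.0, 0.0]]
xi = [0.7, -0.3]
eta = expectations(xi, X)
eta_fd = minus_grad_massieu(xi, X)
```

The two lists `eta` and `eta_fd` should agree up to the finite-difference error.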

The $\eta_i$ are also global coordinates, and the (−1)-affine structure is
defined by forming convex sums of these coordinates; it is therefore also flat
and torsion-free. It coincides with the usual convex mixture of states, and
so is called the mixture affine structure. Each of these affine structures
defines a concept of parallel transport of vectors in the tangent space; neither
of these affine structures is metric invariant, but they are dually related by
the metric. Amari also interpolates between these, to get the family of
($\alpha$)-affine structures, of which $\alpha = 0$ is self-dual and therefore metric: it is the
Levi-Civita affine structure.

It has been remarked [18] that several dissipative models
used in neural nets and physics can be expressed as the projection or rolling
of a linear dynamics onto the surface given by a family of distributions.
The random variables whose distributions form the manifold are taken to
be the slow variables of the theory; the other variables are thermalised by
the projections that keep the dynamics on the manifold.

The mathematical legitimacy of the procedure was strengthened by the
studies of Chentsov. He had the idea of regarding stochastic maps as the
natural morphisms between statistical structures. The Fisher-Rao metric is
contracted by any such map; moreover, it is the only Riemannian metric (up
to a constant factor) to have this property. In physics, dissipative dynamics
is given by a semigroup of stochastic maps, and the contractive property
is the expression of convergence to equilibrium at large times. These are
necessary properties of any good theory. Thus there is a certain uniqueness,
in the classical case, of the geometry of parametric families. Chentsov
remarked that this is not the case in the geometry of quantum information
manifolds in $n$ dimensions. This was studied by Hasegawa, Nagaoka and
Petz [13]. The set of faithful density
matrices is a manifold $\mathcal{M}$ of dimension $n^2 - 1$. The morphisms are taken
to be stochastic completely positive maps. Chentsov showed that there are
many metrics on the tangent space $T$ of $\mathcal{M}$ that are contracted by these
morphisms. One can identify $T$ at $\rho$ with the linear space of Hermitian
matrices with zero expectation in the state $\rho$. This is the quantum analogue
of the 'centred random variables' that make up the tangent space in +1
coordinates in the classical case. Chentsov's work on the possible metrics
was completed by Petz in the case of finite dimensions. Examples of
these can be found in Roepstorff. Hasegawa and Nagaoka, in particular,
emphasise two important cases. These are the symmetric KMS metric, and
the BKM metric.

Given a faithful density matrix, $\rho$, the KMS metric on the vector space
of $n \times n$ Hermitian matrices is constructed from the complex scalar product
$\langle X, Y\rangle = \mathrm{Tr}(\rho X^* Y)$ by taking the real part. The KMS metric has been
used in quantum estimation theory [41], and it coincides with the Study-Fubini
metric on the projective sphere when restricted to the pure states. I
have also used it extensively. In spite of this, it seems that the BKM
metric is better, for two reasons. It is defined on the space of centred operators $\hat{X} = X - \rho.X$
by

$$\langle \hat{Y}, \hat{X}\rangle := \mathrm{Tr}\int_0^1 \rho^{\lambda}\,\hat{Y}\,\rho^{\lambda'}\,\hat{X}\,d\lambda. \qquad (3)$$

Here, and elsewhere in the paper, $\lambda' = 1 - \lambda$. First, in the quantum case, the
(±1)-affine structures are not dual relative to the KMS metric. Alternatively,
one can take, as in [27], the mixture affine structure as a start; then its dual
relative to the KMS metric is not flat and torsion-free. It follows that there
do not exist dual potentials, related by a Legendre transform, corresponding
to the physically important objects, the entropy and the Massieu function.
The second reason why I now prefer the BKM metric is mathematical: the
BKM metric is smaller than the KMS metric; the latter does not exist in
general for unbounded operators, and certainly not for forms. The
mathematical stylishness of the BKM version of the information manifold is so
compelling that perhaps the extensive work on quantum estimation should
be redone with the BKM metric replacing the KMS metric.
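In finite dimensions the comparison of the two metrics can be made explicit. In a basis where $\rho$ is diagonal with eigenvalues $p_i$, the BKM quadratic form of a Hermitian matrix $X$ is $\sum_{ij} L(p_i, p_j)|X_{ij}|^2$, with $L$ the logarithmic mean, while the real part of $\mathrm{Tr}(\rho X^* X)$ is $\sum_{ij} \tfrac12(p_i + p_j)|X_{ij}|^2$. The sketch below is a hedged illustration only: the eigenvalues, the matrix, and the function names are hypothetical, and the centring of $X$ is omitted since the inequality holds entry by entry.

```python
import math

def log_mean(p, q):
    """Logarithmic mean: the integral of p**lam * q**(1 - lam) over lam in [0, 1]."""
    if p == q:
        return p
    return (p - q) / (math.log(p) - math.log(q))

def bkm_norm_sq(p, X):
    """BKM quadratic form of X for a diagonal state with eigenvalues p."""
    n = len(p)
    return sum(log_mean(p[i], p[j]) * abs(X[i][j]) ** 2 for i in range(n) for j in range(n))

def kms_norm_sq(p, X):
    """Real part of Tr(rho X* X) for Hermitian X, written with arithmetic means."""
    n = len(p)
    return sum(0.5 * (p[i] + p[j]) * abs(X[i][j]) ** 2 for i in range(n) for j in range(n))

def bkm_by_quadrature(p, X, steps=2000):
    """Midpoint-rule evaluation of Tr of the integral of rho^lam X rho^(1-lam) X."""
    n = len(p)
    total = 0.0
    for k in range(steps):
        lam = (k + 0.5) / steps
        total += sum(p[i] ** lam * p[j] ** (1.0 - lam) * abs(X[i][j]) ** 2
                     for i in range(n) for j in range(n)) / steps
    return total

# Hypothetical faithful state and Hermitian matrix.
p = [0.5, 0.3, 0.2]
X = [[1.0, 2 + 1j, 0.0],
     [2 - 1j, -1.0, 0.5j],
     [0.0, -0.5j, 0.25]]
bkm = bkm_norm_sq(p, X)
kms = kms_norm_sq(p, X)
bkm_numeric = bkm_by_quadrature(p, X)
```

Since the logarithmic mean never exceeds the arithmetic mean, `bkm` cannot exceed `kms`, which is the finite-dimensional content of the remark that the BKM metric is the smaller of the two.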

In the classical case, Pistone and Sempi introduce information
manifolds in general, not parametrised by a finite number of slow variables.
Thus they are in the field of nonparametric estimation theory. However,
they must start somewhere, and they fix a basic underlying measure space,
whose measure $\mu$ need not be finite, but is used to specify the null sets. The
probability measures of the manifold are then those smoothly related to the
given one. The present paper is an attempt to get a nonparametric version
in the quantum case. It follows up earlier work [37], in which the Hilbert
space was of infinite dimension, but the manifold was of finite dimension,
corresponding to limiting our attention to finitely many 'slow' variables. In
the quantum case, we need a trace, not necessarily finite, to play the role of
$\mu$; we need a density matrix to play the role of $p$; this is provided by a 'free
Hamiltonian' $H_0$, a positive selfadjoint operator with domain $\mathcal{D}(H_0)$, on a
Hilbert space $\mathcal{H}$, such that there exists $\beta_0 \ge 0$ with

$$\rho_\beta = Z_\beta^{-1} e^{-\beta H_0} \ \text{ is a density operator for all } \beta > \beta_0. \qquad (4)$$

This condition holds for the harmonic oscillator, and also for the Laplacian
in a compact region in $\mathbf{R}^n$, with smooth boundary and Dirichlet boundary
conditions, or in a rectangle in $\mathbf{R}^n$ with periodic boundary conditions. In
all these examples, $\beta_0 = 0$. The condition corresponds to a thermodynamically
stable system in a finite box, in which there are no phase transitions
for $\beta > \beta_0$. The zero-point energy of $H_0$ has no significance, as the addition
of a constant to $H_0$ is cancelled by the corresponding change in $Z$; we may
therefore assume that $H_0 \ge I$. By scaling, we may assume that $\beta_0 < 1$, and
we start with a state of the form

$$\rho_0 = e^{-(H_0 + \Psi_0)}. \qquad (5)$$

Here, $\Psi_0 = \log Z_0$. The condition given by eq. (4) is enough to allow the
quantum analogue of the 'Cramér class' of random variables arising in the
classical case [33].



In §(2) we shall construct our first patch of the manifold. It will be
a set of states related to the basic state $\rho_0$ by a small form perturbation
of $H_0$. A form is a bilinear real map $\varphi, \psi \mapsto X(\varphi, \psi) \in \mathbf{R}$, where $\varphi, \psi$ run
over the form domain $Q(X)$. For example, the positive selfadjoint operator
$H_0$ defines the quadratic form $q_0$ with form domain $Q_0 = \mathcal{D}(H_0^{1/2})$ by the
definition

$$q_0(\varphi, \psi) = \langle H_0^{1/2}\varphi, H_0^{1/2}\psi\rangle. \qquad (6)$$

The theory of small forms allows us to write the operator $H_X$ as the unique
selfadjoint operator satisfying

$$\langle H_X^{1/2}\varphi, H_X^{1/2}\psi\rangle = q_0(\varphi, \psi) + X(\varphi, \psi), \qquad (7)$$

for all $\varphi, \psi \in Q_0$. The perturbed state

$$\rho_X = Z_X^{-1}\exp\{-H_X\} = \exp\{-(H_X + \Psi_X)\} \qquad (8)$$

is shown (lemma 4) to exist provided that the form bound of $X$ is smaller
than $1 - \beta_0$; it inherits most of the good properties of $\rho_0$. The forms $X$
which are $q_0$-bounded give us the Cramér class at $\rho_0$.

States of the form $\rho_X$ are the canonical states for the Hamiltonian
$H_0 + X$. The case of bounded perturbations has been extensively analysed
by Araki. In linear response theory such states are thought of as the
equilibrium state reached in response to an external field, whose 'effective
potential' is $X$. We do not insist on this interpretation; for example, in the
version of non-equilibrium statistical mechanics called 'statistical dynamics'
[38], we regard $\rho_X$ as a nonequilibrium state parametrised by $X$. Ingarden
[18] has called the possible $X$ the 'slow variables'; Jaynes calls them the
accessible variables, in line with his subjective view of entropy. The point
of working in infinite dimensions is to have a space of states large enough
to contain the dynamics, so that the 'reduced description' can be given a
geometric flavour as the projection from the full manifold to a submanifold
described by the means of the variables of interest.

The parametrisation of the perturbed states is established by an excursion
into the theory of sesquiforms. We show that if $X$ is relatively
form-bounded, then the expectation value $\rho_0.X = \mathrm{Tr}(\rho_0 X)$ can be given an
unambiguous meaning, and is continuous in $X$ when the space of relatively
bounded forms is provided with a natural norm, in which it becomes
a Banach space, $T(0)$. A form obeying $\rho_0.X = 0$ is said to be centred. The
subspace $\hat{T}(0) \subset T(0)$ given by centred variables then defines a closed
subspace; the open ball of radius $1 - \beta_0$ in $\hat{T}(0)$ is in bijection with a collection
of states of the form eq. (8). This is our first patch of the information
manifold, i.e. a mapping from the set of states into the open unit ball of a Banach
space.

In §(3) we develop analysis with the first patch; in particular, we derive
the Duhamel formula for the difference of two states of the form eq. (8) in
terms of integrals of sesquiforms.

In §(4) we look at the two affine structures (±1) in the first patch.
Parallel transport in the (+1) structure is easy to define; but the (−1)-affine
sum, which is the ordinary mixture of states, may lead outside the manifold.

In §(5) the manifold is extended by adding overlapping patches based on
points in already established patches. The key here is that a perturbed state
$\rho_X$ inherits all the good properties of $\rho_0$, and that the norms on overlapping
patches are equivalent. That we should be able to do better than just the
first patch is easy to understand; if $X = (1/2)H_0$, then $X$ is $H_0$-small, and we
can define $H_X = H_0 + X = (3/2)H_0$. Then an operator $Y$ can be $H_X$-small
without $X + Y$ being $H_0$-small. So we define $H_0 + X + Y = (H_0 + X) + Y$ in
two stages. In this case we get the same operator for $\beta(H_0 + X + Y)$ as if we
use $\tfrac{3}{2}\beta(H_0 + \tfrac{2}{3}Y)$, so in this case we could get there in one step from a
state of different temperature. However, in general we expect to reach new
states, not obtainable in one step. In this way we build up our manifold
of states, reachable from $\rho_0$ in a finite number of steps. It is clear that
the whole space of $H_0$-bounded operators cannot be reached from the state
of given $\beta$; for $X = -H_0$ will never be reached; roughly, the manifold
we construct lies in the direction of $+H_0$. The (+1) affine structure and
parallel transport can be extended to the whole manifold, which is proved
to be convex.
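The two-stage construction can be illustrated in a commuting toy model, where all operators are diagonal in one basis and the relative form norm reduces to a supremum of ratios of diagonal entries. The sketch below is only an illustration under these assumptions: the truncation to 50 levels, the choice $Y = (3/5)H_0$, and the helper name `relative_form_norm` are hypothetical.

```python
def relative_form_norm(x, h):
    """sup_n |x_n| / h_n: the norm of R^{1/2} X R^{1/2} when X and H are diagonal
    in a common basis and H is strictly positive (R is the inverse of H)."""
    return max(abs(xn) / hn for xn, hn in zip(x, h))

levels = 50
h0 = [float(n + 1) for n in range(levels)]     # spectrum of H_0, normalised so H_0 >= I
x = [0.5 * e for e in h0]                      # X = (1/2) H_0
y = [0.6 * e for e in h0]                      # Y = (3/5) H_0, a hypothetical second step
hx = [e + xe for e, xe in zip(h0, x)]          # H_X = H_0 + X = (3/2) H_0

norm_x_0 = relative_form_norm(x, h0)           # X relative to H_0
norm_y_x = relative_form_norm(y, hx)           # Y relative to H_X, about 0.4
norm_sum_0 = relative_form_norm([a + b for a, b in zip(x, y)], h0)  # X + Y relative to H_0
```

Here $X$ is small relative to $H_0$ and $Y$ is small relative to $H_X$, while $X + Y$ has norm greater than 1 relative to $H_0$, so the combined perturbation is reached in two steps but not in one.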



2 Sesquiforms and Perturbation Theory 

In this section we extend the definition of perturbed states, beyond that
considered in [37], in two ways. First, we allow $X$ to be a quadratic form,
bounded relative to the quadratic form $q_0$. This means the following. Let
$X$ be a sesquilinear form defined on $Q_0$; it is said to be $q_0$-bounded if there
exist numbers $a, b$ such that

$$|X(\psi, \psi)| \le a\,q_0(\psi, \psi) + b\,\|\psi\|^2 \quad \text{for all } \psi \in Q_0. \qquad (9)$$

If $a$ can be chosen less than 1 (by a good choice of $b$), then we say that $X$
is $q_0$-small. In our case, we shall need to choose $a$ smaller than $1 - \beta_0$. If $a$
can be chosen to be arbitrarily small, we say that $X$ is $q_0$-tiny.

It is not hard to show that $\mathcal{D}(H_0) \subseteq Q_0$. Any $H_0$-small operator is also
$q_0$-small (see Theorem X.18 of the cited reference).

The second extension of [37] is that we consider simultaneously the set of
all $q_0$-bounded forms, and provide them with a norm; we obtain a parametrisation
of the space of perturbed states by $X$ of small norm by the open unit
ball of a Banach space: this is our first patch of the manifold.

2.1 Sesquiforms 

We can give a meaning to left and right products of quadratic forms by
certain operators. Suppose that $q$ is a quadratic form with domain $Q(q)$; then $q$
defines a sesquilinear form $q(\varphi, \psi)$ by polarisation, with domain $Q(q) \times Q(q)$.
Let $A, B$ be operators such that $A^*$ and $B$ are densely defined, $A^*$ taking
$\mathcal{D}(A^*)$ into $Q(q)$, and $B$ mapping $\mathcal{D}(B)$ into $Q(q)$. Then by the expression
$AqB$ we mean the sesquilinear form given by

$$\varphi, \psi \mapsto q(A^*\varphi, B\psi), \qquad \varphi \in \mathcal{D}(A^*),\ \psi \in \mathcal{D}(B).$$

It is obvious that the product is associative: $(AB)q = A(Bq)$. More generally,
given dense linear sets $\mathcal{D}_1, \mathcal{D}_2$, a sesquilinear map from $\mathcal{D}_1 \times \mathcal{D}_2$ to $\mathbf{C}$
will be called a sesquiform. A sesquiform is not required to be symmetric.
The 'formal adjoint' of the sesquiform $q$ is the form $q^*$ with domain $\mathcal{D}_2 \times \mathcal{D}_1$,
and given by

$$q^*(\varphi, \psi) = \overline{q(\psi, \varphi)}. \qquad (10)$$

We note that the restriction of a sesquiform to a pair of dense linear subspaces
of $\mathcal{D}_1$ and $\mathcal{D}_2$ is also a sesquiform, and that sesquiforms with the
same domains can be added to give a sesquiform.



Definition 1 A sesquiform $q$ will be said to be bounded if there exists $C$
such that

$$|q(\varphi, \psi)| \le C\,\|\varphi\|\,\|\psi\| \quad \text{holds on the domain.}$$

Lemma 2 Suppose that $X$ is a $q_0$-bounded symmetric form defined on $Q_0$.
Then $R_0^{1/2} X R_0^{1/2}$ is a bounded symmetric form defined everywhere.
Conversely, if $X$ is a symmetric form with domain $Q_0$, and $\|R_0^{1/2} X R_0^{1/2}\| < 1$,
then $X$ is $q_0$-small.

PROOF.

Recall that we have normalised $H_0$ so that $H_0 \ge I$; then $R_0 = H_0^{-1}$ is
bounded (by 1). It is known that $R_0^{1/2}$ maps $\mathcal{H}$ onto $Q_0$, and so $R_0^{1/2} X R_0^{1/2}$
is everywhere defined. So, suppose that $X$ is $q_0$-bounded. Then, by (9),

$$|R_0^{1/2} X R_0^{1/2}(\psi, \psi)| = |X(R_0^{1/2}\psi, R_0^{1/2}\psi)| \le a\,q_0(R_0^{1/2}\psi, R_0^{1/2}\psi) + b\,\|R_0^{1/2}\psi\|^2 \le (a + b)\,\|\psi\|^2,$$

so $R_0^{1/2} X R_0^{1/2}$ is bounded.

For the converse, assume that $X$ is a quadratic form with domain $Q_0$,
and that $a = \|R_0^{1/2} X R_0^{1/2}\| < 1$, and let $\psi = R_0^{1/2}\varphi$ be any element of $Q_0$.
Then

$$|X(\psi, \psi)| = |\langle \varphi, R_0^{1/2} X R_0^{1/2}\varphi\rangle| \le a\,\|\varphi\|^2 = a\,q_0(\psi, \psi).$$

Hence $X$ is $q_0$-small, with $b = 0$. □

The Kato-Rellich theory can be extended to forms. The key is the KLMN
theorem (Vol. 2, p. 167 of the cited reference), which we give here in weaker form.

Theorem 3 Let $H_0$ be a positive self-adjoint operator, with quadratic form
$q_0$ and form domain $Q_0$; let $X$ be a $q_0$-small symmetric quadratic form.
Then there exists a unique self-adjoint operator $H_X$ with form domain $Q_0$
and such that

$$\langle H_X^{1/2}\varphi, H_X^{1/2}\psi\rangle = q_0(\varphi, \psi) + X(\varphi, \psi), \qquad \varphi, \psi \in Q_0. \qquad (11)$$

Moreover, $H_X$ is bounded below by $-b$, and any domain of essential
self-adjointness for $H_0$ is a form core for $H_X$.

Now let $X$ be $q_0$-small, with bounds $a, b$, with $a < 1 - \beta_0$. Denote by $H_X$
the unique operator given by KLMN; let $q_X$ denote its form. $H_X$ inherits
the main property of $H_0$, thus:

Lemma 4 $\exp(-\beta H_X)$ is of trace class for all $\beta > \beta_X = \beta_0/(1 - a)$.

PROOF

We have, as quadratic forms on $Q_0$, the inequalities

$$-bI + (1 - a)q_0 \le q_X \le bI + (1 + a)q_0. \qquad (12)$$

Let $L$ be any finite-dimensional subspace of $Q_0$, and let $q$ stand for $q_0$ or
$q_X$. Put

$$\lambda(q, L) = \sup\{q(\psi, \psi) : \|\psi\| = 1,\ \psi \in L\}. \qquad (13)$$

Then the ordered eigenvalues of $q$ are given by the Rayleigh-Ritz principle:

$$\lambda(q, n) = \inf\{\lambda(q, L) : \dim L = n\}. \qquad (14)$$

From the inequality (12) we have for each subspace $L$

$$-b + (1 - a)\lambda(q_0, L) \le \lambda(q_X, L) \le b + (1 + a)\lambda(q_0, L). \qquad (15)$$

Taking now the inf over $L$ we get

$$-b + (1 - a)\lambda(q_0, n) \le \lambda(q_X, n) \le b + (1 + a)\lambda(q_0, n). \qquad (16)$$

Since $\lambda(q_0, n) \to \infty$ as $n \to \infty$, the spectrum of $H_X$ is purely discrete. We
thus get for any $\beta > 0$,

$$e^{b\beta}e^{-(1-a)\lambda(q_0,n)\beta} \ge e^{-\lambda(q_X,n)\beta} \ge e^{-b\beta}e^{-(1+a)\lambda(q_0,n)\beta}.$$

Taking the sum over $n$ gives the traces:

$$e^{b\beta}\,\mathrm{Tr}\big(e^{-(1-a)H_0\beta}\big) \ge \mathrm{Tr}\big(e^{-H_X\beta}\big) \ge e^{-b\beta}\,\mathrm{Tr}\big(e^{-(1+a)H_0\beta}\big).$$

So if $\exp\{-(1-a)\beta H_0\}$ is of trace class for all $(1-a)\beta > \beta_0$, then $\exp\{-H_X\beta\}$
is of trace class for all $\beta > \beta_X = \beta_0/(1 - a)$. □
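The two-sided trace bound in this proof can be checked in a commuting toy model, where $H_0$ and the perturbation are diagonal in the same basis and the eigenvalues of $H_X$ are simply the sums of the diagonal entries. The sketch below is illustrative only: the shifted oscillator spectrum, the truncation to 200 levels, the particular perturbation, and the values of `a`, `b` and `beta` are hypothetical choices.

```python
import math

def trace_exp(levels, beta):
    """Tr exp(-beta * H) for a diagonal H with the given eigenvalues."""
    return sum(math.exp(-beta * e) for e in levels)

a, b, beta = 0.4, 0.5, 1.3
n_levels = 200

# Shifted oscillator spectrum, normalised so that H_0 >= I.
h0 = [float(n + 1) for n in range(n_levels)]

# A diagonal perturbation obeying |x_n| <= a * h0_n + b, i.e. a form bound as in (9).
x = [a * e * math.sin(n) + b * math.cos(n) for n, e in enumerate(h0)]
hx = [e + xe for e, xe in zip(h0, x)]

upper = math.exp(b * beta) * trace_exp([(1 - a) * e for e in h0], beta)
middle = trace_exp(hx, beta)
lower = math.exp(-b * beta) * trace_exp([(1 + a) * e for e in h0], beta)
```

In this truncated model the inequality holds level by level, so `lower <= middle <= upper`; the content of the lemma is that the Rayleigh-Ritz principle gives the same ordering of eigenvalues when the operators do not commute.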

We define the Cramér class at $\rho_0$ to be the $q_0$-bounded forms $X$; for then
by Lemma (4) there exists a neighbourhood $N$ of zero such that for $\lambda \in N$,
$\exp\{-(H_0 + \lambda X)\}$ is of trace class.

It follows from lemma (2) that if $\|R_0^{1/2} X R_0^{1/2}\| < a_0 = 1 - \beta_0$, then
$H_X \ge 0$; for we may take $b = 0$ in the KLMN theorem.



2.2 The First Piece 

We now get the first piece of our manifold. Let $T(0)$ be the real linear space
of $q_0$-bounded quadratic forms, with domain $Q_0$ and norm

$$\|X\|_0 = \|R_0^{1/2} X R_0^{1/2}\| < \infty. \qquad (17)$$

We note that $I$, the unit operator, lies in $T(0)$. The map $A \mapsto H_0^{1/2} A H_0^{1/2}$
from the set of all bounded Hermitian operators $B(\mathcal{H})$ onto the set of
symmetric $q_0$-bounded sesquiforms is an isometry; this shows that $T(0)$ is
isometric to $B(\mathcal{H})$; in particular, $T(0)$ is complete, and so is a Banach space.
To each element $X$ of the interior $\mathrm{Int}\,\mathcal{T}_{a_0}(0)$ of the ball in $T(0)$ of radius
$a_0 = 1 - \beta_0$, define the density matrix

$$\rho_X = Z_X^{-1}e^{-(H_0 + X)} = e^{-(H_0 + X + \Psi_X)}. \qquad (18)$$



The first piece of our manifold is the set $\mathcal{M}_0$ of such states. We set up the
patch by mapping $\mathcal{M}_0$ bijectively onto the interior of a ball in a Banach
space; our space $T(0)$, with its ball $\mathcal{T}_{a_0}(0)$, will not do, for if we alter $X$
by adding a multiple of $I$, we do not change the state: $\rho_X = \rho_{X+\alpha I}$, as
the change in $X$ just leads to an equal and opposite change in $\Psi_X = \log Z_X$, which
cancels. Conversely, if $\rho_X = \rho_Y$ lie in $\mathcal{M}_0$, then $X - Y$ is a multiple of $I$.
For, as $\rho_X, \rho_Y$ are faithful states, we may take logarithms: $\log\rho_X = \log\rho_Y$.
Then

$$H_0 + X + \Psi_X = H_0 + Y + \Psi_Y$$

so

$$X - Y = (\Psi_Y - \Psi_X)\,I.$$

Furnish $T$ with an equivalence relation

$$X \sim Y \ \text{ if } \ X - Y = \alpha I \ \text{ for some } \alpha \in \mathbf{R}. \qquad (19)$$

Then the equivalence classes are lines in $T(0)$ parallel to $I$. We furnish the
set $T(0)/\!\sim$ of equivalence classes with the topology induced from $T$. That
is, an open set in the quotient is the set of all lines passing through some
open set in $T(0)$. There is then a bijection between the set $\mathcal{M}_0$ and the
subset of this quotient space defined by

$$\{\{X + \alpha I\}_{\alpha\in\mathbf{R}} : \|X\|_0 < a_0 = 1 - \beta_0\}. \qquad (20)$$

The bijection is given by

$$\rho_X \mapsto \{Y \in T(0) : Y = X + \alpha I \ \text{ for some } \alpha \in \mathbf{R}\}. \qquad (21)$$
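The redundancy removed by the equivalence relation is easy to see in a commuting toy model, where the state has diagonal weights proportional to $\exp(-(h_n + x_n))$. The sketch below is a hedged illustration: the four-level spectrum, the perturbation, the shift `alpha`, and the function names are hypothetical, and the mean is the ordinary finite-dimensional expectation.

```python
import math

def gibbs_state(h, x):
    """Diagonal (commuting) toy model: weights exp(-(h_n + x_n)) / Z_X."""
    weights = [math.exp(-(hn + xn)) for hn, xn in zip(h, x)]
    z = sum(weights)
    return [w / z for w in weights]

def mean(rho, x):
    """Expectation of the diagonal observable x in the diagonal state rho."""
    return sum(r * xn for r, xn in zip(rho, x))

h0 = [1.0, 2.0, 3.0, 4.0]          # hypothetical spectrum of H_0
x = [0.3, -0.2, 0.1, 0.4]          # hypothetical diagonal perturbation X
alpha = 0.75

rho_x = gibbs_state(h0, x)
rho_shifted = gibbs_state(h0, [xn + alpha for xn in x])    # state for X + alpha I

# Representative of the line through X with zero mean in the unperturbed state.
rho_0 = gibbs_state(h0, [0.0] * len(h0))
x_centred = [xn - mean(rho_0, x) for xn in x]
rho_centred = gibbs_state(h0, x_centred)
```

All three of `rho_x`, `rho_shifted` and `rho_centred` describe the same state, and `x_centred` is the unique point on the line through `x` whose mean in `rho_0` vanishes.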




Thus $\mathcal{M}_0$ becomes topologised, by transfer of structure. It is obviously a
Hausdorff space. Indeed, it is well known that the quotient topology is
equivalent to the topology given by the quotient norm [21], p. 140. However,
to construct the patch, we parametrise $\mathcal{M}_0$ by a ball in a Banach subspace
rather than the quotient. In finite dimensions [14, 31, 33] this has been
done by selecting a point on each line in $T$, namely, the centred variable
$\hat{X} = X - \rho_0.X$. The trouble is that we cannot prove that $\rho_0 X$ is a sesquiform
of trace class. We can however find a natural definition for its trace. Suppose
that $X \in T(0)$, and consider the sesquiform $\rho_0^{\lambda} X \rho_0^{\lambda'}$ for $0 < \lambda < 1$. Choose
$\beta_1 \in (\beta_0, 1)$ and put $\delta_1 = \lambda(1 - \beta_1)$ and $\delta_2 = \lambda'(1 - \beta_1)$. From associativity,
this is equal to

$$\big(\rho_0^{\lambda-\delta_1}\big)\big(\rho_0^{\delta_1}H_0^{1/2}\big)\big(R_0^{1/2} X R_0^{1/2}\big)\big(H_0^{1/2}\rho_0^{\delta_2}\big)\big(\rho_0^{\lambda'-\delta_2}\big). \qquad (22)$$

This is an operator of trace class, as we see from the Hölder inequality for
Schatten norms:

$$\|ABCDE\|_1 \le \|A\|_{1/\lambda}\,\|B\|_\infty\,\|C\|_\infty\,\|D\|_\infty\,\|E\|_{1/\lambda'} \qquad (23)$$



where $A, \dots, E$ are the factors bracketed in eq. (22). For example, we note
that

$$\|A\|_{1/\lambda} = \Big(\mathrm{Tr}\,\rho_0^{(\lambda-\delta_1)/\lambda}\Big)^{\lambda} < \infty,$$

since $(\lambda - \delta_1)/\lambda = \beta_1 > \beta_0$. By cyclicity of the trace, its trace is independent of $\lambda \in
(0, 1)$. This needs proving, because of the possible existence of Connes cyclic
cocycles. Formal differentiation of $\rho_0^{\lambda} X \rho_0^{\lambda'}$ gives

$$\rho_0^{\lambda}\,[X, H_0]\,\rho_0^{\lambda'}$$

whose trace might be equal to

$$\mathrm{Tr}\,(\rho_0 X H_0 - H_0\rho_0 X) = 0$$

by cyclicity; but it is just such expressions that might give non-zero cyclic
cocycles if not all the operators are bounded.

We hope to peel off a small part of $\rho_0^{\lambda'}$ and put it at the front. This
would be possible if the remaining factor were of trace class. This is proved
in the following

Lemma 5 Suppose that $\rho_0^{\beta}$ is of trace class for all $\beta > \beta_0$, where $0 \le \beta_0 < 1$;
suppose that $X$ is $q_0$-bounded. Then

$$\rho_0^{\lambda} X \rho_0^{\lambda'}$$

is of trace class for all $0 < \lambda < 1$, and its trace is independent of $\lambda$; here as
always, $\lambda' = 1 - \lambda$.

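PROOF. The form is that of a bounded operator, since e.g. $\rho_0^{\delta}$ maps $\mathcal{H}$ into
$Q_0$. Write

$$\rho_0^{\lambda} X \rho_0^{\lambda'} = \rho_0^{\lambda-\delta_1}\big(\rho_0^{\delta_1}H_0^{1/2}\big)\big(R_0^{1/2} X R_0^{1/2}\big)\big(H_0^{1/2}\rho_0^{\delta_2}\big)\rho_0^{\lambda'-\delta_2-\delta}\,\rho_0^{\delta} \qquad (24)$$

for suitably chosen $\delta, \delta_1, \delta_2 > 0$, such that $\delta_1 < \lambda$ and $\delta + \delta_2 < \lambda'$. The
product of the three operators in brackets in eq. (24) is bounded, by $C$ say
(but not uniformly in the $\delta$'s). By the Hölder inequality, this gives

$$\big\|\rho_0^{\lambda} X \rho_0^{\lambda'-\delta}\big\|_1 \le \big\|\rho_0^{\lambda-\delta_1}\big\|_{1/(\lambda-\delta_1)}\cdot C\cdot\big\|\rho_0^{\lambda'-\delta_2-\delta}\big\|_{1/\mu}$$

where $\lambda - \delta_1 + \mu = 1$, i.e. $\mu = \lambda' + \delta_1$. Now, $\rho_0$ is a positive operator, so

$$\big\|\rho_0^{\lambda-\delta_1}\big\|_{1/(\lambda-\delta_1)} = (\mathrm{Tr}\,\rho_0)^{\lambda-\delta_1} = 1$$

and

$$\big\|\rho_0^{\lambda'-\delta_2-\delta}\big\|_{1/\mu}$$

is finite if

$$\rho_0^{(\lambda'-\delta_2-\delta)/\mu}$$

is of trace class; the exponent is

$$(\lambda' - \delta_2 - \delta)/(\lambda' + \delta_1)$$

which can be made larger than $\beta_0$ by choosing the $\delta$'s very small. It follows
that we can cycle the last factor, $\rho_0^{\delta}$, to the front without changing the trace,
thereby increasing $\lambda$ and decreasing $\lambda'$. We can choose $\delta$ as close to $\lambda' - \delta_2$
as we please; it follows that the trace is independent of $\lambda'$ provided that
$\lambda' > \delta_2$. Since $\delta_2$ was arbitrary, we have the result. □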

We define the regularised mean of $X$ in the state $\rho_0$ to be

$$\rho_0.X := \mathrm{Tr}\big[\rho_0^{\lambda} X \rho_0^{\lambda'}\big], \quad \text{for one and hence all } \lambda \in (0, 1). \qquad (25)$$

Moreover, $\rho_0.X$ is continuous as a map $T(0) \to \mathbf{R}$. This follows since our
bound has $\|X\|_0$ as a factor. The set

$$\hat{T}(0) := \{X \in T(0) : \rho_0.X = 0\} \qquad (26)$$

is a closed linear subspace of $T$ of codimension 1. The norm $\|\cdot\|_0$, restricted
to $\hat{T}(0)$, makes $\hat{T}(0)$ into a Banach space. We map $\mathcal{M}_0$ bijectively onto the
open subset of $\hat{T}(0)$ given by the points $\hat{X}$ where the corresponding points in
$T(0)/\!\sim$ (i.e. the lines parallel to $I$) intersect it; such a point is unique,
being given by $\alpha = -\rho_0.X$. This bijection is a homeomorphism, since both
$\mathcal{M}_0$ and $\hat{T}(0)$ have been given the topology induced from $T(0)$. The map
$\rho_X \mapsto \hat{X}$ is our first chart and its inverse is our first patch. As usual in
the construction of Banach manifolds, we identify the tangent space at the
origin of this chart with the space $\hat{T}(0)$ itself. In this identification, the
tangent of a curve of the form

$$\{\rho(\lambda) = e^{-(H_0 + \lambda X + \Psi_{\lambda X})} : \lambda \in [-\delta, \delta]\}$$

is identified with $\hat{X} = X - \rho_0.X$. We see from the picture that our patch
is the "shadow" of $\mathcal{T}_{1-\beta_0}(0)$ onto the hyperplane $\hat{T}(0)$, and that it contains
the ball $\hat{\mathcal{T}}_{1-\beta_0}(0)$ and in its turn is contained in the open set

{yeTi(0):||y||o<l + |po.i^|}. 

We note that in finite dimensions, the set of operators parallel to $I$ is
orthogonal to the hyperplane $\hat{T}(0)$ when the space $T(0)$ is furnished with the
BKM metric. We seem to need more regularity than we have at present if the
BKM metric is to be finite in infinite dimensions. Obviously, $g_X(Y, I)$ can be
defined when one of the operators is the unit operator, and $g_X(Y, I) = \rho_X.Y$.
Thus the subspaces $\hat{T}(X)$ are all orthogonal to the space parallel to $I$.

3 Analysis in the First Patch 

So far we have a manifold Mq with one patch. Before enlarging the manifold 
by the addition of more patches, we do some analysis. 

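First, it is clear that all states in $\mathcal{M}_0$ have finite entropy and regularised
mean energy, which are related by

$$S(\rho_X) = -\mathrm{Tr}\,\rho_X\log\rho_X = \rho_X.H_X + \Psi_X. \qquad (27)$$

For $\rho_X.H_X = \mathrm{Tr}\big\{\rho_X^{1-\delta}\big(\rho_X^{\delta}H_X\big)\big\}$, which is finite for $\delta < 1 - \beta_0$.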

Lemma 6 Let $A$ be a closed operator and $B$ be a bounded operator such
that $B\mathcal{H} \subseteq \mathcal{D}(A)$. Then $C = AB$ is bounded.

PROOF.

We note that $\mathcal{D}(C) = \mathcal{H}$, so by the closed graph theorem it is enough to
show that $C$ is closed. For this, let $\psi_n \to \psi$ and suppose that $C\psi_n$ converges.
Then we must show that $\psi \in \mathcal{D}(C)$ and $C\psi = \lim C\psi_n$. The first is true, as
$\mathcal{D}(C) = \mathcal{H}$; for the second, we see that $B\psi_n \to B\psi$, as $B$ is bounded, and
$A(B\psi_n)$ converges. Since $A$ is closed, we conclude that $B\psi \in \mathcal{D}(A)$ (already
known) and $C\psi_n = A(B\psi_n) \to AB\psi = C\psi$. □



Lemma 7 Let $X \in \mathcal{M}_0$, $R_0 = H_0^{-1}$ and $R_X = H_X^{-1}$. Then $R_0^{1/2}H_X^{1/2}$ and
$R_X^{1/2}H_0^{1/2}$ are bounded.

PROOF.

Since $H_0 \ge I$, we see that $R_0^{1/2}$ is bounded and maps $\mathcal{H}$ into $\mathcal{D}(H_0^{1/2}) =
Q_0 = \mathcal{D}(H_X^{1/2})$, and $H_X^{1/2}$ is closed. So by lemma (6), $C = H_X^{1/2}R_0^{1/2}$ is
bounded; its adjoint, the closure of $R_0^{1/2}H_X^{1/2}$, is therefore also bounded. We
find

$$C^*C = R_0^{1/2}H_X R_0^{1/2} = 1 + R_0^{1/2} X R_0^{1/2},$$

so

$$(1 - \|X\|_0)\,I \le C^*C \le (1 + \|X\|_0)\,I.$$

Thus the inverse of $C$, namely $H_0^{1/2}R_X^{1/2}$, is bounded by $(1 - \|X\|_0)^{-1/2}$. □



Lemma 8 Let $X$ and $Y$ lie in $\mathcal{M}_0$ and put $q_X = q_0 + X$ on $Q_0$. Then $Y$ is
$q_X$-bounded.

PROOF.

As sesquiforms, we have

$$\big\|R_X^{1/2} Y R_X^{1/2}\big\| = \big\|R_X^{1/2}H_0^{1/2}\,R_0^{1/2} Y R_0^{1/2}\,H_0^{1/2}R_X^{1/2}\big\| \le \big\|R_X^{1/2}H_0^{1/2}\big\|\,\|Y\|_0\,\big\|H_0^{1/2}R_X^{1/2}\big\| < \infty$$

by lemma (7). □

We now come to the very useful Duhamel formula for forms.



Theorem 9 Let $X$ be a symmetric form, $q_0$-small, and let $H_X$ be the
self-adjoint operator with form $q_0 + X$. Then

$$e^{-H_0} - e^{-H_X} = \int_0^1 e^{-\lambda H_0}\,X\,e^{-\lambda' H_X}\,d\lambda \qquad (28)$$

where the r.h.s. means the limit of $\int_{\epsilon}^{1-\delta}$ as $\epsilon$ and $\delta$ converge to zero of the
given sesquiform evaluated at any $\varphi, \psi \in \mathcal{H} \times \mathcal{H}$.



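PROOF.

Consider the family of operators

$$F(\lambda) = e^{-\lambda H_0}\,e^{-(1-\lambda)H_X} \qquad (29)$$

where $0 < \lambda < 1$. These are of trace class, since we can apply Hölder's
inequality with parameters $1/\lambda$ and $1/\lambda'$. For any $\varphi, \psi \in \mathcal{H}$ we define

$$F_{\varphi,\psi}(\lambda) = \big\langle e^{-\lambda H_0}\varphi,\ e^{-(1-\lambda)H_X}\psi\big\rangle. \qquad (30)$$

Since $e^{-\lambda H_0}$ maps $\mathcal{H}$ into $\mathcal{D}(H_0) \subseteq Q_0$, and $e^{-\lambda' H_X}$ maps $\mathcal{H}$ into $\mathcal{D}(H_X) \subseteq
Q_0$, we see that $F_{\varphi,\psi}$ is differentiable, and

$$\frac{d}{d\lambda}F_{\varphi,\psi}(\lambda) = (q_X - q_0)\big(e^{-\lambda H_0}\varphi,\ e^{-\lambda' H_X}\psi\big) = X\big(e^{-\lambda H_0}\varphi,\ e^{-\lambda' H_X}\psi\big).$$

Integrating from 0 to 1 gives the theorem. □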
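In finite dimensions the Duhamel formula can be verified numerically, including for non-commuting $H_0$ and $X$. The sketch below is a hedged illustration for real symmetric 2×2 matrices only: the matrices `H0` and `X`, the closed-form exponential, and the use of Simpson's rule are choices made for this example and are not taken from the paper.

```python
import math

def expm_sym2(A, t):
    """exp(-t * A) for a real symmetric 2x2 matrix A = [[a, b], [b, d]].

    Writes A = m*I + B with B traceless, so that B*B = r**2 * I and
    exp(-t*B) = cosh(t*r) * I - (sinh(t*r) / r) * B.
    """
    a, b = A[0]
    d = A[1][1]
    m = 0.5 * (a + d)
    r = math.hypot(0.5 * (a - d), b)
    c = math.cosh(t * r)
    s = t if r == 0.0 else math.sinh(t * r) / r
    e = math.exp(-t * m)
    return [[e * (c - s * (a - m)), -e * s * b],
            [-e * s * b, e * (c - s * (d - m))]]

def matmul(P, Q):
    """Product of two 2x2 matrices."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def duhamel_integral(H0, X, n=200):
    """Simpson's rule (n even) for the integral over [0, 1] of
    exp(-lam*H0) X exp(-(1-lam)*HX), where HX = H0 + X."""
    HX = [[H0[i][j] + X[i][j] for j in range(2)] for i in range(2)]
    h = 1.0 / n
    total = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(n + 1):
        lam = k * h
        weight = 1 if k in (0, n) else (4 if k % 2 else 2)
        term = matmul(matmul(expm_sym2(H0, lam), X), expm_sym2(HX, 1.0 - lam))
        for i in range(2):
            for j in range(2):
                total[i][j] += weight * term[i][j] * h / 3.0
    return total

# Hypothetical non-commuting example.
H0 = [[1.0, 0.0], [0.0, 2.0]]
X = [[0.2, 0.3], [0.3, -0.1]]
HX = [[H0[i][j] + X[i][j] for j in range(2)] for i in range(2)]

E0, EX = expm_sym2(H0, 1.0), expm_sym2(HX, 1.0)
lhs = [[E0[i][j] - EX[i][j] for j in range(2)] for i in range(2)]
rhs = duhamel_integral(H0, X)
max_error = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
```

The value of `max_error` should be at the level of the quadrature error. In the bounded case no limiting procedure at the endpoints is needed; the care taken in the theorem concerns unbounded forms.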



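Lemma 10 Suppose that $X, Y$ are $q_0$-bounded forms, and that the $q_0$-bound
of $Y$ is $a < a_0$. Then $\rho_0^{\lambda} X \rho_Y^{\lambda'}$ is of trace class for $0 < \lambda < 1$.

PROOF: We can write

$$\rho_0^{\lambda} X \rho_Y^{\lambda'} = \rho_0^{\lambda\delta'}\big(\rho_0^{\lambda\delta}H_0^{1/2}\big)\big(R_0^{1/2} X R_0^{1/2}\big)\big(H_0^{1/2}R_Y^{1/2}\big)\big(H_Y^{1/2}\rho_Y^{\lambda'\delta}\big)\rho_Y^{\lambda'\delta'}$$

where $\delta \in (0, 1)$ will be chosen soon, and $\delta' = 1 - \delta$. The terms in brackets
are all bounded, so the operator norm of their product is bounded by $C$
say; this can grow as $\lambda$ approaches its limits 0 and 1. We now use Hölder's
inequality for traces, to get

$$\big\|\rho_0^{\lambda} X \rho_Y^{\lambda'}\big\|_1 \le C\,\big\|\rho_0^{\lambda\delta'}\big\|_{1/(\lambda\delta')}\,\big\|\rho_Y^{\lambda'\delta'}\big\|_{1/\mu}$$

(where $\mu = \lambda' + \delta\lambda$), which is finite if

$$\big\|\rho_Y^{\lambda'(1-\delta)/(\lambda'+\delta\lambda)}\big\|_1 < \infty.$$

Given $\lambda$ we can choose $\delta$ so small that

$$\lambda'(1-\delta)/(\lambda'+\delta\lambda) > \beta_Y = \beta_0/(1 - a)$$

since the latter is less than 1. □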

This does not show that the integral converges in trace norm; for the 
trace norm of the integrand might become unbounded at the ends, and 
cannot be shown to be integrable over [0, 1]. However, the trace, as opposed 
to the trace-norm, does converge. 



Lemma 11 Let $X, Y$ be $q_0$-small, $Y$ having bound less than $a_0 = 1 - \beta_0$.
Then $\mathrm{Tr}\,\rho_0^{\lambda} X \rho_Y^{\lambda'}$ is bounded for $0 < \lambda < 1$.



PROOF. 

We first note that we can use the cyclicity of the trace to take out a factor,
a small power of $\rho_0$, on the left of the expression; this is a bounded operator and the
remaining product is, as above, still of trace class. We can therefore permute
these two factors under the trace. We make use of this when $\lambda > 1/2$. If
$\lambda < 1/2$ we take out a part of the power of the state $\rho_Y$ from the right and
put it on the left under the trace; we now illustrate the method by doing
this case.



Tiip^Xp^' 



Since A' > 1/2, we have 



Tr{(pf/^<)(<i7o^/>o^ 



pl/2^pl/2 



1^1/2 pl/2 

Hq Ky 



<pf/^)pf'} 



$$\big\|\rho_Y^{\delta/2}H_Y^{1/2}\big\| \le \sup_x\{x^{1/2}e^{-x\delta/2}\} \le C/\delta^{1/2}$$



This bound occurs twice. The other factors in brackets are bounded operators
with norm bounded by $C_1$, independent of $\lambda$ and $\delta$. By Hölder,



Trfpo'^p'' 



Now choose $1 - \delta > \beta_Y$ independent of $\lambda$. This gives a bound on the trace
independent of $\lambda \in (0, 1/2)$. Similarly, in the region $\lambda \in (1/2, 1)$ we get a
bound on the trace independent of $\lambda$. □



CiC^5-^ 


pI 


l/A 


ff 


C25'\l. 


pV 


A' 

1 
Corollary 12 We have

$$\mathrm{Tr}\,e^{-H_0} - \mathrm{Tr}\,e^{-H_X} = \int_0^1 \mathrm{Tr}\big(e^{-\lambda H_0}\,X\,e^{-\lambda' H_X}\big)\,d\lambda. \qquad (31)$$

By inserting normalising factors we convert the exponentials into states,
and by specialising to the case $X = Y$ we show that the integrand is a
bounded function of $\lambda$ in $(0, 1)$. It follows that the integral of the trace is
absolutely convergent, and the trace is the sum of the diagonal elements in
any orthonormal basis. The trace and the integral can be exchanged, by Fubini's
theorem. □

We have obtained an estimate for the perturbation from the state $\rho_0$.
Now $H_X$ inherits the properties of $H_0$, at least if we replace it by $H_X + I$.
We have shown that if $Y$ obeys

$$\|Y\|_X := \big\|R_X^{1/2} Y R_X^{1/2}\big\| < a, \qquad (32)$$

then $Y \in \mathcal{M}_0$ is $q_X$-small; if $X$ is chosen small enough in this norm, and,
depending on $X$, $Y$ is also small enough, we may replace $H_0$ by $H_X$ and
$H_X$ by $H_{X+Y}$ in these estimates. This may also be done in lemma (8); this
shows

Lemma 13 $\|Y\|_X$ and $\|Y\|_0$ are equivalent norms.

For,

$$\|Y\|_X = \big\|R_X^{1/2}H_0^{1/2}\,R_0^{1/2} Y R_0^{1/2}\,H_0^{1/2}R_X^{1/2}\big\| \le \big\|R_X^{1/2}H_0^{1/2}\big\|^2\,\|Y\|_0$$

and the inequality in the other direction is similar. This equivalence is the
key to the extension of the manifold to other patches.
We see from the resulting estimate

$$Z_X - Z_{X+Y} = \mathrm{Tr}\,e^{-H_X} - \mathrm{Tr}\,e^{-H_{X+Y}} \le C\,\|Y\|_X \qquad (33)$$

that the partition function $Z_X$ is a Lipschitz function of $X \in \mathcal{M}_0$.



4 AfRne Geometry in M. 







By an affine structure for a manifold $\mathcal{M}_0$ we mean a rule for forming the convex mixture "$\lambda\rho_1 + \lambda'\rho_2$" ($0 < \lambda < 1$; $\rho_1, \rho_2 \in \mathcal{M}_0$), where $\lambda' = 1 - \lambda$. An affine space is a space with a specified affine structure; it is necessarily convex. The unit ball $\hat{\mathcal{T}}_1(0)$ in $\hat{\mathcal{T}}(0)$ is a convex subset of a Banach space and so has a natural affine structure coming from the linear structure. By 'transfer of structure', the chart $X \mapsto \rho_X$ from $\hat{\mathcal{T}}_1(0)$ to $\mathcal{M}_0$ provides $\mathcal{M}_0$ with the induced affine structure. This is called the canonical or (+1)-affine structure. Clearly, "$\lambda\rho_X + \lambda'\rho_Y$" $= \rho_{\lambda X + \lambda' Y}$, which differs from the usual mixture of states $\rho = \lambda\rho_X + \lambda'\rho_Y$. The latter is called the mixture or (-1)-affine structure of the state space. While it is obvious that $\mathcal{M}_0$ is (+1)-convex, it is not clear that it is (-1)-convex. That is, while the (-1)-mixture $\rho$ of $\rho_1$ and $\rho_2$ is certainly a state, it might not lie in $\mathcal{M}_0$ even if $\rho_1, \rho_2 \in \mathcal{M}_0$.
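The difference between the two mixtures can be seen in a small numerical sketch. It assumes a commuting, finite-dimensional toy model in which $H_0$, $X$ and $Y$ are diagonal, so that a state $\rho_X = e^{-(H_0 + X)}/Z_X$ is a list of probabilities. The helper names and the numbers are illustrative assumptions, not taken from the paper.

```python
import math

def gibbs_state(h, x):
    """Diagonal of rho_X = exp(-(H0 + X)) / Z_X for diagonal H0 and X."""
    weights = [math.exp(-(hi + xi)) for hi, xi in zip(h, x)]
    z = sum(weights)
    return [w / z for w in weights]

def plus_one_mixture(h, x, y, lam):
    """(+1)-mixture: the state rho_{lam X + lam' Y}, with lam' = 1 - lam."""
    mixed_form = [lam * xi + (1.0 - lam) * yi for xi, yi in zip(x, y)]
    return gibbs_state(h, mixed_form)

def minus_one_mixture(h, x, y, lam):
    """(-1)-mixture: the ordinary convex combination lam rho_X + lam' rho_Y."""
    rho_x, rho_y = gibbs_state(h, x), gibbs_state(h, y)
    return [lam * a + (1.0 - lam) * b for a, b in zip(rho_x, rho_y)]

# Illustrative diagonal entries (assumed values).
h0 = [1.0, 2.0, 3.0]
x = [0.5, 0.0, -0.5]
y = [-0.3, 0.2, 0.1]

p_plus = plus_one_mixture(h0, x, y, 0.5)
p_minus = minus_one_mixture(h0, x, y, 0.5)
largest_gap = max(abs(a - b) for a, b in zip(p_plus, p_minus))
```

Both results are normalised states, but they are different points of the state space: the (+1)-mixture averages the exponents, the (-1)-mixture averages the densities.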

4.1 The (+1)-affine connection

Let $\mathcal{T}_1$ and $\mathcal{T}_2$ be affine spaces; then an affine map $U : \mathcal{T}_1 \to \mathcal{T}_2$ is a map obeying

$$U(\lambda\rho_1 + (1 - \lambda)\rho_2) = \lambda U\rho_1 + (1 - \lambda) U\rho_2 \qquad (34)$$

for all $\rho_1, \rho_2 \in \mathcal{T}_1$ and $0 < \lambda < 1$. An affine connection on a Banach manifold is an assignment, for each continuous curve $L$ from $\rho_1$ to $\rho_2$, of an affine map $U_L$ from the tangent space at $\rho_1$ to the tangent space at $\rho_2$, obeying $U_L U_{L'} = U_{L \cup L'}$; the map $U$ for the empty path $\emptyset$, when $\rho_1 = \rho_2$, is the identity, and the symbol $L \cup L'$ denotes the path $L$ followed by the path $L'$; if $L'$ is the path $L$ with reversed parameter, we take $L \cup L' = \emptyset$. These axioms ensure that $U_L$ is an invertible map for any $L$. An affine connection is linear if $U_L(0) = 0$ for every curve $L$; a linear connection is torsion-free. If $U_L$ is independent of $L$ then the connection is called flat, or curvature-free. The commonly used formulation is the infinitesimal version of the above, obtained by differentiating, if the structure is smooth.

In order to define the (+1)-affine connection concretely, we first put coordinates on the tangent space at any $X \in \mathcal{M}_0$. We have seen that $\rho_0 \cdot Y$ is continuous in $Y \in \mathcal{M}_0$. Since $H_X$ inherits all relevant properties of $H_0$, we obtain a similar estimate $|\rho_X \cdot Y| \le \text{const}\, \|Y\|_X$. The set

$$\hat{\mathcal{T}}(X) := \{Y : \rho_X \cdot Y = 0\} \qquad (35)$$

is therefore a closed subspace of $\mathcal{T}(0)$ in the equivalent topology defined by $\|\cdot\|_X$. We identify the tangent space at $\rho_X$ to be the Banach space $\hat{\mathcal{T}}(X)$ with the norm $\|Y\|_X$. We then take the (+1)-parallel transport $U_L$ of $Y - \rho_0 \cdot Y \in \hat{\mathcal{T}}(0)$ along any path $L$ in the manifold to be the point $Y - \rho_X \cdot Y \in \hat{\mathcal{T}}(X)$. This map takes $Y = 0$ to zero, and extends to a linear mapping from $\hat{\mathcal{T}}(0)$ onto $\hat{\mathcal{T}}(X)$. We see that parallel transport is nothing other than the moving of the representative point in the line $\rho$ from one hyperplane to the other. Since this transport is independent of $L$ and linear, the connection is flat and torsion-free.

5 Extension of the Manifold 

We see two ways of extending the manifold by gluing new patches. The first is to try to include as many $q_0$-small perturbations $X$ as possible, and not just those obeying $\|X\|_0 < a_0$; recall that this condition is sufficient for $X$ to have $q_0$-bound less than $a_0 < 1$. The second, and main, method of extension is to use any point $\rho_X$ in the first patch, and to consider perturbations $Y$ of $H_X$ with $\|Y\|_X < a_X = 1 - \beta_X$. This might lead out of $\mathcal{M}_0$; we can continue indefinitely, starting at $H_Y$ etc. In this way we include eventually a state of any temperature, and the manifold points generally in the $+H_0$ direction.

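Suppose then that $X$ is symmetric and $q_0$-small enough. Then there exists a self-adjoint operator $H_X$ whose form is $q_0 + X$ with form-domain $Q_0$ and

$$|X(\psi, \psi)| \le a\left(q_0(\psi, \psi) + b\|\psi\|^2/a\right) \quad \text{for some } a < a_0. \qquad (36)$$

Let $\tilde{H}_0 = H_0 + \frac{b}{a} I$; this is self-adjoint on $D(H_0)$. Let $\tilde{R}_0 = \tilde{H}_0^{-1}$. Then for $\psi \in \mathcal{H}$ we have $\tilde{R}_0^{1/2}\psi \in Q_0$, and

$$|\langle \tilde{R}_0^{1/2} X \tilde{R}_0^{1/2}\psi, \psi\rangle| \le a\,\langle \tilde{R}_0^{1/2}\psi, \tilde{H}_0 \tilde{R}_0^{1/2}\psi\rangle = a\|\psi\|^2, \qquad (37)$$

so

$$\|X\|_{\tilde{0}} := \|\tilde{R}_0^{1/2} X \tilde{R}_0^{1/2}\| \le a < a_0.$$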



We have thus shown that for any $q_0$-small form $X$ with bound $< a_0$ there exists a choice of Hamiltonian $\tilde{H}_0$ such that $\rho_X$ lies inside the open ball $\|X\|_{\tilde{0}} < a_0$. Let us furnish $\mathcal{T}(0)$ with this norm; it is equivalent to the norm $\|X\|_0$, since $\tilde{R}_0^{1/2} H_0^{1/2}$ and $R_0^{1/2} \tilde{H}_0^{1/2}$ are both bounded. We can therefore add the patches obtained in this way to the first patch, to get a Banach manifold. We can enlarge the manifold even further, by analogy with the classical case [53]. Let $X$ be a $q_0$-small form; it therefore defines a unique pair of self-adjoint operators, $H_\pm$, with forms $q_\pm := q_0 \pm X$; we include $\rho_X$ in the first patch if for each choice of sign, $\rho_\pm = \exp(-H_\pm)$ is of trace class. For such an $X$ the quantum analogue of the Luxemburg norm is finite.

Definition 14 We put

$$\|X\|_L = \inf\left\{r > 0 : \operatorname{Tr}\left[\left(\exp -(H_0 + X/r) + \exp -(H_0 - X/r)\right)/(2Z_0)\right] - 1 < 1\right\}$$

For large $r$, $X/r$ is $q_0$-small, and so the traces make sense; since $Z_X$ is continuous in $X/r$ if it is small, the set in the infimum is non-empty. As $r$ becomes smaller, either the operator $H_\pm$ fails to be unique, or the finiteness of the trace might fail; in either case we put the trace equal to $\infty$, and the corresponding $r$ is a lower bound for $\|X\|_L$. It can be shown that $\|X\|_L$ is a seminorm.
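To make Definition 14 concrete, here is a minimal sketch that evaluates the infimum by bisection in a commuting, finite-dimensional toy model. It assumes $H_0$ and $X$ are diagonal, so the bracket reduces to $\sum_i e^{-h_i}\cosh(x_i/r)/Z_0 - 1$, which decreases as $r$ grows. The function names and the overflow guard are illustrative assumptions, not part of the paper.

```python
import math

def safe_cosh(t):
    """cosh(t), returning infinity instead of raising when t is too large."""
    try:
        return math.cosh(t)
    except OverflowError:
        return math.inf

def excess(h, x, r):
    """Tr[(e^{-(H0 + X/r)} + e^{-(H0 - X/r)}) / (2 Z0)] - 1 for diagonal H0 and X."""
    z0 = sum(math.exp(-hi) for hi in h)
    total = sum(math.exp(-hi) * safe_cosh(xi / r) for hi, xi in zip(h, x))
    return total / z0 - 1.0

def luxemburg_norm(h, x, tol=1e-12):
    """Infimum of r > 0 with excess(h, x, r) < 1, located by bisection."""
    if all(xi == 0 for xi in x):
        return 0.0
    lo, hi = 0.0, 1.0
    while excess(h, x, hi) >= 1.0:  # enlarge hi until it is admissible
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if excess(h, x, mid) < 1.0:
            hi = mid
        else:
            lo = mid
    return hi

# Illustrative diagonal entries (assumed values).
h0 = [0.0, 1.0]
x = [1.0, -1.0]
norm_x = luxemburg_norm(h0, x)
norm_2x = luxemburg_norm(h0, [2.0 * xi for xi in x])
```

For this choice every $|x_i| = 1$, so the condition reads $\cosh(1/r) < 2$ and the infimum is $1/\operatorname{arccosh}(2)$; doubling $X$ doubles the value, as expected of a seminorm.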

The second extension of the manifold is to construct a similar chart around a state $\rho_X$ as we did around $\rho_0$, where $X$ is $q_0$-small, with bound $< a_0$. Since the choice of $\tilde{H}_0$ was anyway arbitrary provided $\tilde{H}_0 \ge I$, we drop the tilde; so we consider $X \in \mathcal{M}_0$. Choose $H_X \ge I$. This Hamiltonian inherits all the properties of $H_0$. Let $Y$ be $q_X$-small with bound $< a_X$; then there is a unique self-adjoint operator $H_{X+Y}$, whose form domain is $Q_0$, and whose form is $q_0 + X + Y$, such that



$$\rho_{X+Y} = Z_{X+Y}^{-1}\, e^{-H_{X+Y}} \qquad (38)$$



is of trace-class, and $\rho_X \cdot Y$ can be defined as $\operatorname{Tr}(\rho_X^{\lambda} Y \rho_X^{\lambda'})$. Let $\mathcal{T}(X)$ be the Banach space of forms $Y$ such that $\|Y\|_X < \infty$, with this norm. Since $\|\cdot\|_0$ and $\|\cdot\|_X$ are equivalent norms, this space is actually the same as $\mathcal{T}(0)$ as a set. The interior of this ball in $\mathcal{T}(X)$ consists of certain $q_X$-small forms which are $q_0$-bounded but might not lie in $\mathcal{T}_1(0)$. Let $\mathcal{M}_X$ be the set of states of the form eq. (38). Again, two forms that differ by a multiple of $I$ yield the same state, and there is a bijection between $\mathcal{M}_X$ and the set of lines $\{\rho_{X+Y}\}$ in $\mathcal{T}(X)$ parallel to $I$ that cut the open ball in $\mathcal{T}(X)$ of radius $1 - \beta_X$. The set of such lines is furnished with the quotient topology. Let $\hat{\mathcal{T}}(X)$ be the (closed) hyperplane $\{Y \in \mathcal{T}(X) : \rho_X \cdot Y = 0\}$. Each line $\rho_{X+Y}$ cuts this hyperplane in a unique point, and those in the neighbourhood of $Y = 0$ cut the plane inside our open ball $\mathcal{T}_{a_X}(X)$. This gives a chart from an open set in $\mathcal{M}_X$ onto this ball. Again, the tangent space at $\rho_X$ is identified with $\hat{\mathcal{T}}(X)$. The (+1)-affine structure in $\mathcal{M}_X$ is that induced from the linear structure of $\hat{\mathcal{T}}(X)$. We can enlarge this piece of the manifold to include all $q_X$-small perturbations $Y$ such that $\exp -\{H_0 + X + Y\}$ is of trace class, and can cover the enlarged set of states by consistent overlapping charts, in the same way as for the first method of extension of $\mathcal{M}_0$.

The next step in building the manifold is to consider the union of $\mathcal{M}_0$ and $\mathcal{M}_X$. The two charts are topologically compatible, in that in the overlap $\mathcal{M}_0 \cap \mathcal{M}_X$ the two norms $\|\cdot\|_0$ and $\|\cdot\|_X$ induced by the charts are equivalent; see lemma (13). The (+1)-affine structures of $\mathcal{M}_0$ and $\mathcal{M}_X$ are the same on their overlap, since both are induced by the linear structure of $\mathcal{T}(0)$. Our choice of parallel transport within the first patch reflects this, and can be extended in stages to a transport between any two points in the union of the pieces. Indeed, let $X_1, X_2$ lie in $\hat{\mathcal{T}}(0)$, so their means in $\rho_0$ are zero. Put $Z = \lambda X_1 + \lambda' X_2$, and let $U$ denote the parallel transport from $\rho_0$ to $\rho_X$. Then

$$U X_i = X_i - \rho_X \cdot X_i, \quad i = 1, 2, \quad \text{and} \quad U Z = Z - \rho_X \cdot Z.$$

Then

$$U(\lambda X_1 + \lambda' X_2) = Z - \rho_X \cdot Z = \lambda X_1 + \lambda' X_2 - \rho_X \cdot (\lambda X_1 + \lambda' X_2) = \lambda U X_1 + \lambda' U X_2.$$

That is, $U$ takes the convex mixture in $\hat{\mathcal{T}}(0)$ to that in $\hat{\mathcal{T}}(X)$. Thus the union of the first two pieces is a Banach manifold furnished with a flat torsion-free affine structure and the (+1)-parallel transport $U$.
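The computation above can be checked numerically in a commuting, finite-dimensional toy model. The sketch assumes that forms are diagonal and stored as lists, that $\rho \cdot Y$ is the expectation $\sum_i \rho_i y_i$, and that subtracting $\rho \cdot Y$ means subtracting that multiple of the identity. All helper names and numbers are illustrative assumptions rather than notation from the paper.

```python
import math

def gibbs_state(h, x):
    """Diagonal of rho_X = exp(-(H0 + X)) / Z_X for diagonal H0 and X."""
    weights = [math.exp(-(hi + xi)) for hi, xi in zip(h, x)]
    z = sum(weights)
    return [w / z for w in weights]

def mean(rho, y):
    """Expectation rho . Y of a diagonal form Y in the state rho."""
    return sum(r * v for r, v in zip(rho, y))

def centre(rho, y):
    """Y - (rho . Y) I: the representative of Y with zero mean in rho."""
    m = mean(rho, y)
    return [v - m for v in y]

def transport(rho_target, y):
    """(+1)-parallel transport: re-centre the form at the target state."""
    return centre(rho_target, y)

# Illustrative data (assumed values).
h0 = [1.0, 2.0, 3.0]
rho_0 = gibbs_state(h0, [0.0, 0.0, 0.0])
rho_x = gibbs_state(h0, [0.5, 0.0, -0.5])

x1 = centre(rho_0, [1.0, -2.0, 0.5])   # tangent vectors at rho_0
x2 = centre(rho_0, [0.0, 3.0, -1.0])
lam = 0.3
z = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]

u_z = transport(rho_x, z)
u_mix = [lam * a + (1.0 - lam) * b
         for a, b in zip(transport(rho_x, x1), transport(rho_x, x2))]
```

Here `u_z` and `u_mix` coincide up to rounding, and `u_z` has zero mean in `rho_x`, which is the toy counterpart of $U$ mapping mixtures in the hyperplane at $\rho_0$ to mixtures in the hyperplane at $\rho_X$.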

We can extend further, to a third piece, starting from a different point $X'$ in $\mathcal{M}_0$ or from a point in $\mathcal{M}_X$ outside $\mathcal{M}_0$. In either case we arrive at a $q_0$-bounded form with domain $Q_0$, and a third piece of the manifold with a chart into an open ball of the Banach space $\{Y : \rho_{X'} \cdot Y = 0\}$, with norm $\|\cdot\|_{X'}$ equivalent to the norms already defined. We continue by induction, starting at any point in the manifold obtained already, to get to any $q_0$-bounded form that can be arrived at in a finite number of steps. At each stage, starting from $\rho_X$ we enlarge the ball of radius $a_X$ by the first method, to include all $q_X$-small forms which define a state. Moreover, suppose we arrive at two far points, $H_0 + X$ and $H_0 + Y$, which however lie in each other's patch. When neither $X$ nor $Y$ is $q_0$-small (but are $q_0$-bounded), we can by construction find a finite chain $X_1, X_2, \ldots$ from $X$ to $H_0$ and another finite chain $Y_1, Y_2, \ldots$ from $H_0$ to $Y$, each small relative to the last; then $R_X^{1/2} Y R_X^{1/2}$ is a finite product

$$R_X^{1/2} H_{X_1}^{1/2} R_{X_1}^{1/2} H_{X_2}^{1/2} \cdots R_0^{1/2} H_{Y_1}^{1/2} R_{Y_1}^{1/2} H_{Y_2}^{1/2} \cdots H_{Y_n}^{1/2} \left(R_{Y_n}^{1/2}\, Y\, R_{Y_n}^{1/2}\right) H_{Y_n}^{1/2} \cdots R_X^{1/2}$$

which is bounded. Thus $\|Y\|_X$ is finite. Similarly $\|X\|_Y$ is finite. Thus when we then extend to all states obtainable in this way in a finite number of steps, all the norms of any overlapping region are equivalent. With each enlargement, we define the (+1)-affine structure and parallel transport in stages from chart to chart, to give a flat torsion-free connection.

Definition 15 The information manifold $\mathcal{M}$ defined by $H_0$ consists of all states obtainable in a finite number of steps, by extending from $\mathcal{M}_0$ by either the first method or the second method, as explained above.

The Cramér class of each $\rho \in \mathcal{M}$ is the set of $q_0$-bounded forms.

The question now arises, when we add perturbations $X_1, \ldots, X_n$ and $Y_1, \ldots, Y_m$ as above, and $X_1 + \ldots + X_n = Y_1 + \ldots + Y_m$ as forms (on $Q_0$), whether we arrive at the same state whichever route we take. We do, since there is a unique self-adjoint operator defined by the form

$$q_0 + X_1 + \ldots + X_n = q_0 + Y_1 + \ldots + Y_m$$

with form domain $Q_0$.

We now have a natural result. 

Theorem 16 $\mathcal{M}$ is (+1)-convex.

PROOF. We first prove the result when $\beta_0 = 0$. Then the only condition on the size of a perturbation $Y$ of $H_X$ is that it be $q_X$-small. In this case it is obvious that the manifold is a cone.

Let $\mathcal{M}(0)$ denote the set of states $\rho_X$ where $X$ is $q_0$-small. Then we define $\mathcal{M}(n)$ inductively by

Definition 17 $\rho_X \in \mathcal{M}(n)$ if there exists $\rho_Y \in \mathcal{M}(n-1)$ such that $X - Y$ is $q_Y$-small, where $q_Y = q_0 + Y_1 + \ldots + Y_{n-1}$, and each addition $Y_j$ is small relative to $q_{j-1}$.

We show that $\mathcal{M}(0)$ is (+1)-convex, and that if $\mathcal{M}(n-1)$ is (+1)-convex, so is $\mathcal{M}(n)$.

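Suppose then that $X_i \in \mathcal{M}(0)$, $i = 1, 2$. Then for $\psi \in Q_0$,

$$|X_1(\psi, \psi)| \le a_1 q_0(\psi, \psi) + b_1\|\psi\|^2$$
$$|X_2(\psi, \psi)| \le a_2 q_0(\psi, \psi) + b_2\|\psi\|^2$$

Then

$$|(\lambda X_1 + \lambda' X_2)(\psi, \psi)| \le \lambda |X_1(\psi, \psi)| + \lambda' |X_2(\psi, \psi)|$$
$$\le \lambda\left(a_1 q_0(\psi, \psi) + b_1\|\psi\|^2\right) + \lambda'\left(a_2 q_0(\psi, \psi) + b_2\|\psi\|^2\right)$$
$$\le a\, q_0(\psi, \psi) + b\|\psi\|^2$$

where $a = \max\{a_1, a_2\} < 1$ and $b = \max\{b_1, b_2\}$. Hence $\lambda X_1 + \lambda' X_2$ is $q_0$-small, and $\mathcal{M}(0)$ is convex.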

Now let $\mathcal{M}(n)$ be obtained from $\mathcal{M}(n-1)$ as in the definition (17). So let $q_Y$ be of the form $q_0 + Y$ where $\rho_Y \in \mathcal{M}(n-1)$, and let $X$ be $q_Y$-small. Then $\rho_{X+Y} \in \mathcal{M}(n)$, and any element of $\mathcal{M}(n)$ is of this form. Let $\rho_1, \rho_2 \in \mathcal{M}(n)$; then there exist $Y_1, Y_2$ such that $\rho_{Y_1}, \rho_{Y_2} \in \mathcal{M}(n-1)$, and writing $q_1 = q_0 + Y_1$ and $q_2 = q_0 + Y_2$, there exist forms $X_1, X_2$ such that $X_1$ is $q_1$-small and $X_2$ is $q_2$-small, and $\rho_1, \rho_2$ are the states corresponding to $q_1 + X_1$ and $q_2 + X_2$. Let $q = \lambda q_1 + \lambda' q_2$; the state corresponding to $q$ lies in $\mathcal{M}(n-1)$, since this is (+1)-convex, by the inductive hypothesis. A simple estimate shows that $\lambda X_1 + \lambda' X_2$ is $q$-small, so that the state corresponding to $q + \lambda X_1 + \lambda' X_2$ lies in $\mathcal{M}(n)$. But the latter is $\lambda(q_1 + X_1) + \lambda'(q_2 + X_2)$, whose corresponding state is the (+1) mixture of $\rho_1$ and $\rho_2$. This shows that $\mathcal{M}(n)$ is (+1)-convex.

Now relax the condition that $\beta_0 = 0$, and define the part $\mathcal{M}(n)$ to be the set of states obtained from $\mathcal{M}(n-1)$ as above, but using only sufficiently small perturbations. Then all the conclusions derived above remain true, up to the result that $\lambda X_1 + \lambda' X_2$ is $q$-small. Thus $q + \lambda X_1 + \lambda' X_2$ is the form of a self-adjoint operator that is bounded below; call this operator $H$. Now, by the convexity of $Z_X$, $\exp -H$ is of trace class, since $\rho_1$ and $\rho_2$ are. The same is true if we replace $X_i$ by $-X_i$; hence $Z^{-1} \exp -H$ lies in $\mathcal{M}(n)$, by the first method of extension. □
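The role of the convexity of $Z_X$ in the last step can be illustrated in a commuting, finite-dimensional toy model. The sketch assumes diagonal $H_0$, $X_1$, $X_2$, for which $Z_X = \sum_i e^{-(h_i + x_i)}$ and convexity follows from the convexity of the exponential. It does not reproduce the operator-theoretic argument of the proof, and the names and numbers are illustrative assumptions.

```python
import math

def partition_function(h, x):
    """Z_X = Tr exp(-(H0 + X)) for diagonal H0 and X."""
    return sum(math.exp(-(hi + xi)) for hi, xi in zip(h, x))

def mix(x1, x2, lam):
    """The form lam X1 + lam' X2, with lam' = 1 - lam."""
    return [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]

# Illustrative diagonal entries (assumed values).
h0 = [1.0, 2.0, 3.0]
x1 = [0.5, 0.0, -0.5]
x2 = [-0.3, 0.2, 0.1]

def convexity_gap(lam):
    """lam Z(X1) + lam' Z(X2) - Z(lam X1 + lam' X2), non-negative if Z is convex."""
    return (lam * partition_function(h0, x1)
            + (1.0 - lam) * partition_function(h0, x2)
            - partition_function(h0, mix(x1, x2, lam)))

gaps = [convexity_gap(k / 10) for k in range(11)]
```

Every entry of `gaps` is non-negative up to rounding, so in this toy the partition function of the mixed form is bounded by the mixture of the two partition functions, which is the mechanism that keeps the trace finite.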

We have not been able to prove that the manifold is (-1)-convex; if $\rho_1$ and $\rho_2$ are density operators in the first patch, then obviously $\rho := \lambda\rho_1 + \lambda'\rho_2$ is a density operator. All we can show from the operator convexity of $-\log x$ [?, ?] is that $-\log \rho = H_0 + X$, where $X$ has $q_0$-bound 1; but we need the bound to be less than 1 for $\rho$ to lie inside the first patch.

6 Outlook 

We have defined a Banach manifold, with the flat torsion-free (+1)-connection. The canonical variables at $\rho_0$ are the centred $q_0$-bounded forms $X$, with the norm $\|\cdot\|_0$. These are (+1)-affine coordinates, and the manifold is a convex set when expressed in terms of these. The Massieu function is a continuous convex function on the manifold. We can therefore construct the Legendre transform using Fenchel's theory, to obtain the 'mixture' variables $\rho_Y \cdot X$ at any point $\rho_Y$ in the manifold. The entropy is a continuous function. With more regularity, we have been able to show that the BKM metric is finite at regular points, and is the Fréchet derivative of $\rho_Y \cdot X$, as in the classical and finite-dimensional cases. Moreover, the free energy is real-analytic. This work, [?, ?] which extends [?], will be published elsewhere.